Abstract

Background: Cytotoxin associated gene product A (CagA) is an oncogenic protein secreted by the gastric
bacterium Helicobacter pylori. Internalization of CagA by human epithelial cells occurs by an unknown mechanism
that requires interaction with the host membrane lipid phosphatidylserine.

Findings: Local homology at the level of amino acid sequence and secondary structure has been identified 
between the membrane-tethering region of CagA and the lipid-binding Fes-CIP4 homology-Bin/Amphiphysin/Rvs 
(F-BAR) domains of eukaryotic proteins. The F-BAR proteins are major components of the endocytic machinery. In 
addition to the membrane-binding F-BAR domains, they contain other domains that interact with actin-regulatory 
networks and mediate interplay between membrane dynamics and cytoskeleton re-arrangements. Positively 
charged residues found on the lipid binding face of the F-BAR domains are conserved in CagA and represent 
residues involved in CagA binding to lipids. 

Conclusions: The homologies with F-BAR proteins extend to lipid binding specificities and involvement in 
reorganization of the actin cytoskeleton. CagA and F-BAR domains share binding specificity for phosphatidylserine 
and phosphoinositides. Similar to the F-BAR proteins, CagA has a membrane-binding module and a module that 
shares structural homology with actin-binding proteins, and, like eukaryotic F-BAR domain proteins, CagA function 
is linked to actin dynamics. The uncovered similarities between the bacterial effector protein and eukaryotic F-BAR 
proteins suggest convergent evolution of CagA towards a similar function. 

Keywords: Cytotoxin-associated gene product A, Lipid binding, Membrane tethering, Lipid specificity, Functional
homology



Findings 

Introduction 

Helicobacter pylori is a Gram-negative pathogenic bacterium
that infects the stomach tissue of approximately half the
world's population [1] and is associated with different
gastric diseases ranging from gastritis to peptic ulcers and
adenocarcinoma [2-4]. Gastric cancer is the second leading
cause of cancer deaths worldwide. Cytotoxin-associated gene
product A (CagA) is a major virulence factor of H. pylori.
Pathogenic strains of H. pylori, associated with the
development of adenocarcinoma in humans, inject CagA into
gastric epithelial cells, where it interacts with many
different host cell proteins (e.g. Abl kinases, SRC,
PAR1b/MARK2 kinases, CrkII, SHP-2 protein tyrosine
phosphatase), interfering with signalling pathways that
regulate cell growth and motility [5-10]. The CagA-mediated
sustained deregulation of these pathways eventually leads to
apoptosis in gastric epithelial cells and cancer.

Although the cellular effects of CagA are well- 
characterized, the structure-function relationship of this 
protein remains poorly understood. The cagA gene 
belongs to a 40 kb genetic locus called the cytotoxin- 
associated gene pathogenicity island (cag-PAI), which is 
hypothesized to have been acquired by horizontal gene 
transfer from an unrelated species [11]. In addition to 
the cagA gene, cag-PAI contains genes that encode
the components of a type IV secretion system (T4SS),
which is responsible for translocating CagA into the host
gastric epithelial cells [12]. Previous studies by Murata- 
Kamiya and co-workers [13] showed that inhibition of 
actin polymerization impaired CagA delivery into human 
epithelial cells, indicating that CagA internalization is 
dependent on host cell machinery and involves actin poly- 
merisation. However, the mechanism by which CagA tra- 
verses the host cell membrane remains to be elucidated. 

Internalization of CagA by host epithelial cells requires 
its interaction with the host membrane lipid
phosphatidylserine (PS) [13] and results in localization of CagA to
the PS-rich inner leaflet of the host cell membrane
[13,14]. Membrane tethering is absolutely required for 
all CagA activities reported to date [6,14,15]. Interes- 
tingly, PS is physiologically present only on the inner 
leaflet of eukaryotic cell membranes; however, it has 
been shown to transiently externalize to the outer leaflet 
of the host plasma membrane at the sites of direct con- 
tact with H. pylori [14]. It is known that CagA exploits 
PS at both the outer and inner leaflets for entry into the 
host cell and localization to the plasma membrane, espe- 
cially in polarized epithelial cells [13]. 

Previous site-directed mutagenesis studies revealed 
that CagA residues R619 and R621 (strain NCTC11637 
numbering) are essential for binding to PS, uptake of 
CagA by the host cells and its association with the host 
cell membrane [13]. Analysis of the crystal structure of 
CagA fragment 1-876 revealed that the corresponding 
residues in strain 26695 (R624 and R626) are located in 
one of the α-helices (α18) of Domain II and, together
with lysine residues at positions 613, 614, 617, 621, 631,
635, 636 of the same α-helix, form a positively charged
patch on the CagA surface [16]. Systematic site-directed
mutagenesis studies revealed that these positively charged
residues are involved in the CagA-PS interaction in
addition to R624 and R626 (strain 26695 numbering)
[16]. It has been hypothesized that the positively charged
face of the α-helix 610-639 (α18) tethers CagA to the
negatively charged phosphate groups of the lipid membrane
via electrostatic interactions.

To begin to understand the molecular mechanisms 
underpinning the internalization of CagA by human epi- 
thelial cells, the sequence and structural characteristics of 
CagA were analysed in comparison to those of other pro- 
teins. Local homology at the level of amino acid sequence 
and secondary structure has been identified between an
α-helical region of CagA and the membrane-targeting region
of the Fes-CIP4 homology-Bin/Amphiphysin/Rvs (F-BAR) 
domain of human proteins. The analysis presented here 
reveals that the homologies with F-BAR proteins extend 
to lipid binding specificities and involvement in reor- 
ganization of the actin cytoskeleton, altogether suggesting 
convergent evolution of CagA to a similar function. 



Methods 

Analysis of the amino acid sequence of CagA from H. pylori
strain 26695 (UniProtKB P55980) using the NCBI Conserved
Domain Architecture Retrieval Tool (CDART)
(http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi)
[17] identified a local homology between CagA residues
613-641 and region 231-259 within the F-BAR domain of human
GAS7 (UniProtKB GAS7_HUMAN) and the corresponding region in
GAS7 homologs from chicken (NCBI XP_415577.2), zebrafish
(NCBI XP_001333507.2), sea squirt (NCBI XP_002123389) and
African clawed frog (NP_001090555.1). Analysis of the local
sequence alignment of CagA and GAS7 over this region
revealed that conserved positively charged residues (lysine,
arginine) implicated in the binding of F-BAR domains to
the membrane are also present (and conserved) in the
CagA sequence. The multiple sequence alignment was
then extended, using ClustalW2
(http://www.ebi.ac.uk/Tools/msa/clustalw2/), to include the
sequences of other F-BAR proteins: proline-serine-threonine
phosphatase-interacting protein 1 (PSTPIP1); human
formin-binding protein 17 (FBP17); and FCH domain only
proteins 1 and 2 (FCHo1, FCHo2). Sequence-based prediction
of the secondary structure of GAS7 was performed using the
Jpred3 server (http://www.compbio.dundee.ac.uk/www-jpred/)
[18]. The homology model of the F-BAR domain of human
PSTPIP1 was constructed using MODELLER (9v12) [19,20] based
on the coordinates of the 2.3-Å resolution crystal structure
of human FCHo2 (RCSB PDB ID 2v0o) [21]. Structure figures
were prepared using PyMOL [22].

Results 

Local sequence homology and the role of the conserved 
positively charged residues in CagA and F-BAR domains 

A similarity search based on domain architecture implemented
in CDART revealed that region 613-641 of the amino acid
sequence of H. pylori CagA shares limited homology with the
second α-helix (α2) of the F-BAR domain of the human protein
GAS7. F-BAR domains are found in many eukaryotic proteins
involved in membrane remodelling processes. They bind to the
negatively charged surface of the lipid bilayer via an
extensive positively charged patch on their surface [23-29].
Figure 1a shows the local alignment of the sequences of CagA,
GAS7 and other representative members of the F-BAR domain
subfamily, highlighting the residues implicated in lipid
binding. CagA region 613-641, region 231-259 of human GAS7
and the respective regions in GAS7 from different eukaryotic
species (chicken, zebrafish, sea squirt and African clawed
frog) show significant (35%) sequence identity. Extending the
analysis to include F-BAR domains of other proteins showed
that, for example, CagA region 613-641 shares 38% and 31%
sequence identity with the corresponding regions of PSTPIP1
and FCHo2, respectively. Alignment of the sequences of
full-length CagA and F-BAR domains indicated that the
detected homology is limited to this local region and does
not extend beyond it. However, it was striking that many
positively charged residues found on the lipid binding face
of the F-BAR domains are conserved in CagA (K613, K614,
K617, K621, R624, R626, K635) and represent residues
involved in CagA binding to lipids (Figure 1a).

Figure 1 Phospholipid binding residues of eukaryotic BAR
domain proteins are conserved in H. pylori CagA. (a) Local
sequence alignment of CagA and F-BAR domains of representative
eukaryotic proteins. The sequences are shown for: H. pylori CagA strain
26695 (UniProtKB P55980); growth arrest-specific protein 7 (GAS7) from
human (UniProtKB GAS7_HUMAN), chicken (NCBI XP_415577.2),
zebrafish (NCBI XP_001333507.2), African clawed frog (NP_001090555.1);
proline-serine-threonine phosphatase-interacting protein 1 (PSTPIP1)
from Brandt's bat (UniProtKB S7PAL8_MYOBR), human (J3KPG6_HUMAN),
rhesus macaque (UniProtKB H9ZF21_MACMU), black flying
fox (UniProtKB L5JZ07_PTEAL); FCH domain only protein 2 (FCHo2)
from human (NCBI NP_620137.2), Jerdon's jumping ant (UniProt
E2C5A9_HARSA), zebrafish (UniProt F1RDR3_DANRE); FCH domain only
protein 1 (FCHo1) from human (UniProt FCHO1_HUMAN); and human
formin-binding protein 17 (FBP17) (UniProtKB Q96RU3.2). Residue
numbering above the sequences corresponds to CagA. Conserved
residues are boxed and shown in bold. Positively charged conserved
residues and their conservative substitutions are highlighted in blue.
Other types of semi-conserved residues and their conservative
substitutions are highlighted in orange. The positions of the amino acid
substitutions associated with loss of PS-binding activity in CagA are
shown by black dots. The positions of the substitutions associated with
loss of lipid-binding activity in F-BAR domains are shown by stars. The
predicted secondary structure of human GAS7 is shown in green
above the sequences (cylinders represent α-helices). The secondary
structures of CagA and FCHo2 derived from their respective crystal
structures [16,21] are shown in red and blue, and helices are labelled as
in [16,21]. (b) Conservation of the CagA positively charged residues
equivalent to the membrane-binding residues of BAR domains. CagA
amino acid sequences from 44 strains were aligned (Additional file 1)
and the residue variability was plotted using a logo representation
where the height of the stack indicates the sequence conservation at a
given position, and the size of the letter denotes a residue's relative
frequency at that position among homologues.
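The sequence identities quoted above (35%, 38% and 31%) are pairwise values over a short aligned region. As a minimal sketch of how such a value can be computed from two already-aligned sequences, the function below counts identical residues over the columns in which both sequences carry a residue. The function name, the choice of denominator (ungapped columns only) and the toy strings are assumptions made for illustration; they are not the CagA or F-BAR sequences, and alignment tools differ in how they define the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where both aligned sequences have a residue.

    Assumption: gapped columns are excluded from the denominator; other
    conventions (e.g. dividing by the full alignment length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment, NOT taken from the paper.
toy_a = "KKLAEKRA-R"
toy_b = "KRLSEKKAGR"
print(round(percent_identity(toy_a, toy_b), 1))
```

For a window as short as 29 residues, each identical position changes the result by about 3.4 percentage points, so values of this kind are best read alongside the pattern of conserved residues rather than in isolation.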

Previous genetic and structural studies on BAR and
F-BAR domains have shown that their binding to phospholipid
membranes is underpinned by electrostatic interactions
involving conserved positively charged residues
on their concave surface formed by helices α2 and α3
[21,23,30]. The identified region in F-BAR domains that
shows homology with CagA forms part of helix α2. The
replacement of K33 in this region in FBP17 with glutamate
significantly reduced its membrane binding and
tubulation activity in vitro and abolished the membrane
invagination induced by GFP-FBP17 in vivo [23]. The
K33Q+R35Q variant of this protein showed reduced
phospholipid-binding ability and liposome tubulation [31].
Residues K37 and K44 (equivalent to K27 and K33 in
FBP17) were shown to contribute to the membrane binding
of the F-BAR domain of Syp1 [29]. Furthermore, the
mouse syndapin residue R46 (equivalent to R35 in FBP17)
was also implicated in membrane binding [24]. The lists
of positively charged residues within this region that are
implicated in membrane binding and deformation by the
F-BAR domains of human FCHo2 and human FBP17 are
given in Table 1.

Analysis of the multiple sequence alignment between
CagA region 613-641 and the respective region in the
F-BAR domains of GAS7, PSTPIP1, FCHo2, FCHo1 and
FBP17 (Figure 1a) showed that positively charged
membrane-binding residues found in this region of F-BAR
domains are present in CagA (Figure 1a, Table 1).
Furthermore, inspection of the alignment of CagA amino acid
sequences from 44 representative H. pylori strains over the
region 613-641 (strain 26695 numbering, see Additional
file 1) showed that all of these residues are either
absolutely conserved or conservatively substituted (K/R)
(Figure 1b). CagA residues K617, K621, R624 and R626
(equivalent to lipid-binding residues K27, K30, K33 and R35
of FBP17) were shown to bind PS [13,16]. The observation
that the conserved positively charged residues play a
similar role in CagA and F-BAR domains is consistent with
the functional commonality between the positively charged
clusters in CagA and in F-BAR domains: both serve as a
membrane-targeting module.

Table 1 F-BAR membrane-binding residues conserved between F-BAR and CagA

FCHo2            FBP17    CagA     References
K33              K27      K617     26, 29 (K37 in Syp1)
Not conserved    K30      K621     26
R40              K33      R624     23, 26, 29, 31 (K44 in Syp1)
R42              R35      R626     31
Not conserved    K52      K644     31
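The conservation analysis described above, in which positions are either invariant or limited to K/R exchanges and are summarized as a sequence logo, can be sketched in a few lines. The code below is an illustrative assumption rather than the procedure used in the study: it computes a per-column information content (log2 of the alphabet size minus the Shannon entropy, without small-sample correction, and counting any gap character as an ordinary symbol) and lists the columns whose residues are restricted to lysine or arginine. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(seqs, alphabet_size=20):
    """Information content (bits) of each alignment column.

    Assumption: simple log2(alphabet_size) - entropy, as used for logo stack
    heights, with no small-sample correction.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have equal length")
    info = []
    for col in range(length):
        counts = Counter(s[col] for s in seqs)
        total = sum(counts.values())
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        info.append(math.log2(alphabet_size) - entropy)
    return info


def basic_only_columns(seqs):
    """Indices of columns where every sequence has K or R (conserved or K/R swap)."""
    return [i for i in range(len(seqs[0])) if {s[i] for s in seqs} <= {"K", "R"}]


# Hypothetical toy alignment, NOT the 44-strain CagA alignment.
toy = ["KAKE", "KARE", "RAKA", "KAKV"]
print(basic_only_columns(toy))
print([round(x, 2) for x in column_information(toy)])
```

In this toy case the first and third columns are flagged as K/R-only, which is the same kind of pattern reported for the basic residues of region 613-641.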

Mapping homology regions on three-dimensional (3D) 
structure of F-BAR domains and CagA 

Mapping of the membrane-binding residues of the
F-BAR domains and the equivalent positively charged
residues of CagA on their respective 3D structures [16,21]
(Figure 2) reveals that although the homologous regions
of CagA and F-BAR domains share similar (predominantly
α-helical) secondary structure, their positions within the
overall protein folds are distinctly different. The
lipid-binding residues of the F-BAR domains reside on a
three-helix coiled-coil structure. The region of local homology
with CagA (blue in Figure 2) is formed by the residues of
helix α2b. In contrast, the equivalent residues in CagA are
located on a helix (α18) that forms part of Domain II
[16], which comprises an extended single-layer β-sheet and
two helical subdomains. Furthermore, unlike CagA, which is
monomeric in vitro [16,32], F-BAR domains are obligate
dimers (Figure 2), and the shape of their dimeric structure
is important for their function in membrane recognition
and bending [23-29]. As illustrated in Figure 2, the region
in F-BAR domains that shows homology with CagA binds
membranes as a dimer. Given these overall topological
differences, are there common local structural features in
these regions that explain why both function as a membrane
tether? Inspection of the structures and the local
sequence alignment (Figure 1) shows that these helices
possess an amphipathic nature with the hydrophobic face
hidden in the protein core, and the hydrophilic face
exposed to the solvent. The positively charged
membrane-binding residues that are conserved between F-BAR
domains and CagA align on the hydrophilic side, facing
the negatively charged phosphate groups of the lipid
membrane (Figure 2), thus promoting favorable interaction
between the protein and phospholipid bilayer.

[Panel labels: CagA fragment 261-829; 90°; F-BAR domain of FCHo2; F-BAR domain of PSTPIP1 (model)]

Figure 2 The membrane-binding residues of the F-BAR domains and the equivalent residues in CagA. The structures are shown for the
F-BAR domains from human FCHo2 (available in the Protein Data Bank (PDB) under code 2v0o [21]), human PSTPIP1 (homology model generated
using the above structure 2v0o as a template), and for CagA fragment 261-829 (PDB code 4dvz [16]). The homology region is coloured blue. The
side chains of the membrane-binding residues of the F-BAR domains that are conserved in CagA are shown as blue sticks and labelled. The
four-helical bundle in CagA that shares structural similarities with proteins that interact with actin-regulatory networks [33] is encircled with a
green dashed line. The side chain of E634 in CagA is coloured red to highlight the site with positive selection [34].
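The amphipathic character described for these helices is often quantified with a helical hydrophobic moment, in which each residue's hydrophobicity is treated as a vector rotated by about 100° per residue around the helix axis. The sketch below is an illustration only and was not part of the analysis reported here; the use of the Kyte-Doolittle hydropathy scale, the ideal 100° rotation, the normalization by length and the toy peptide are all assumptions.

```python
import math

# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, angle_deg=100.0, scale=KYTE_DOOLITTLE):
    """Mean hydrophobic moment per residue for an ideal alpha-helix.

    Each residue i contributes its hydropathy as a vector at angle i * angle_deg;
    the moment is the length of the vector sum divided by the sequence length.
    """
    angle = math.radians(angle_deg)
    sin_sum = sum(scale[aa] * math.sin(i * angle) for i, aa in enumerate(seq))
    cos_sum = sum(scale[aa] * math.cos(i * angle) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)


# Hypothetical toy peptides, NOT CagA or F-BAR sequences.
amphipathic = "LKKLLKKLLKLLKKLLKK"  # leucines cluster on one helical face
uniform = "A" * 18                  # no sidedness: moment is close to zero
print(hydrophobic_moment(amphipathic) > hydrophobic_moment(uniform))
```

A helix with one apolar and one basic face, as proposed here for the membrane-tethering helices, would be expected to give a clearly non-zero moment, whereas a compositionally uniform helix gives a value near zero.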

Discussion 

Membrane tethering of the bacterial effector protein
CagA plays an essential role in its pathogenic activity.
Upon T4SS-mediated translocation into the host
cell, CagA is localized to the inner surface of the cell 
membrane and phosphorylated by the membrane-asso- 
ciated Src kinases. The phosphorylated CagA recruits 
SHP-2 to the plasma membrane, where it activates SHP-2 
phosphatase activity. Activated SHP-2 then dephosphory- 
lates substrates that are also located in close proximity to 
the membrane and thereby generates signals that lead 
to morphological changes of the gastric cell. In addition to 
its role in the intracellular function of CagA, interaction 
with phospholipids, and specifically PS in the outer leaflet
of the host cell membrane, is important for translocation of
CagA across the host cell membrane [13], the mechanism 
of which remains to be elucidated. 

The interaction interface between CagA and the phos- 
pholipid membrane is known to involve several separate 
sites in the protein. The positively charged helix α18
(residues 610-639), harboring a surface-exposed cluster of 
conserved lysine/arginine residues at positions 613, 614, 
617, 621, 624, 626, 631, 635 and 636, is believed to tether 
CagA to the negatively charged phosphate groups of the 
lipid membrane via electrostatic interactions [13,16]. 
Although the first 200 amino acids of CagA have been 
shown to be sufficient for membrane tethering [35], 
regions 200-800 and 800-1216 were subsequently shown 
to also be important for membrane binding [36], leading 
to a hypothesis that two separate domains within the 
C-terminal region, spanning residues 200-800 and 800- 
1216, interact in trans to mediate interactions with the 
host cell membrane. 

The analysis presented here reveals a previously un- 
suspected similarity between the membrane-tethering 
helices of CagA and eukaryotic F-BAR domains, thus 
providing a new insight into the molecular mechanisms 
underpinning interaction of CagA with lipid membranes 
of human epithelial cells. The discovery that, despite the 
low overall sequence identity and distinctly different 
protein folds, many positively charged residues found 
on the lipid binding face of the F-BAR domains are
conserved in CagA and represent residues involved in
CagA binding to lipids suggests that the effector protein 
CagA acquired a similar function through convergent 
evolution. In line with this finding, CagA and F-BAR 
domains have similar lipid specificity profiles. All BAR 
superfamily members, including F-BAR domains, bind 
to the plasma membrane through electrostatic interac- 
tions with negatively charged phospholipids, showing 
high affinity to PS and phosphoinositides such as phos- 
phatidylinositol (PI) 4,5-bisphosphate (PI(4,5)P2) and PI 
3,4,5-trisphosphate (PI(3,4,5)P3) [21,27,30,31]. Similarly,
H. pylori CagA strongly binds PS and phosphoinositides,
including PI 3-phosphate (PI3P), PI4P, PI5P, PI(3,4)P2,
PI(3,5)P2 and PI(4,5)P2 [16]. Interaction of CagA with
the host membrane PS, which is aberrantly externalized
at the site of H. pylori attachment, plays an essential role
in the translocation of CagA across the host cell membrane
and subsequent CagA localization to its inner leaflet,
which is central to the pathophysiological activity of this 
protein [13]. The CagA region of homology to F-BAR 
domains (amino acids 613-641) resides entirely within 
the boundaries of the PS-binding domain mapped by
previous studies [14,16]. Many of the positively charged 
residues important for the PS binding by CagA (K613, 
K614, K617, K621, R624, R626, K631, K635 and K636) 
are conserved in F-BAR domains, where their role is also 
to specifically recognise PS and phosphoinositides. This 
supports the notion of a functional commonality between 
these positively charged clusters in CagA and in F-BAR 
domains which convergently evolved as eukaryotic mem- 
brane-targeting modules. 

This conclusion is in line with a recent study of the
evolution of the cagA gene by Furuta et al. [34], which
revealed that region 613-641 contains a site (amino acid
634) that has undergone positive selection. The side chain
of the residue at this position points towards the putative
interface with the negatively charged membrane surface
(Figure 2). The local sequence alignment (Figure 1) shows
that, in contrast to CagA from H. pylori strain 26695,
where this position is occupied by an acidic (glutamate)
residue, many F-BAR domains have a small residue at this
site (alanine, glycine, serine), which would be more
favourable for the interaction with the negatively charged
membrane surface. From this point of view, it is important
to note that, as shown by Furuta et al. [34], in the course
of adaptive evolution, many Eastern (more pathogenic)
strains of H. pylori have also acquired a small residue
(alanine or valine) at this position, whereas a significant
proportion of Western (less pathogenic) strains have
glutamate. This observation raises the interesting possibility
that CagA in more pathogenic strains evolved to bind to
membranes with higher affinity. This hypothesis should
be tested experimentally in the future.

Figure 3 Domain architecture of the representative F-BAR proteins.
GAS7, growth arrest-specific protein 7; PSTPIP1, proline-serine-
threonine phosphatase-interacting protein 1; FCHo1,2, FCH domain
only proteins 1 and 2; FBP17, formin-binding protein 17; CIP4,
Cdc42 interacting protein 4.

Further functional parallels between the bacterial effector
protein CagA and eukaryotic F-BAR domain containing
proteins can be drawn when one considers their
respective mechanisms of action. Members of the F-BAR
domain protein subfamily are typically linked to
reorganization of the actin cytoskeleton [21,27,31,37,38]. In
addition to the membrane-binding F-BAR domain, they
usually contain other domains (e.g. SH3, WW, MHD,
HR1 (Figure 3)) that interact with actin-regulatory
networks. These proteins are often found at endocytic sites
where they mediate interplay between membrane dynamics
and cytoskeletal components by binding to the neck of
the endocytic vesicle via the F-BAR domain and recruiting,
via the other domains, factors that initiate actin
polymerisation for vesicle budding. One of the well-characterized
biological functions of H. pylori CagA is the drastic
change of the morphology of gastric cells (elongation)
caused by CagA-mediated deregulation of the actin
cytoskeleton [39]. Thus, like eukaryotic F-BAR domain
proteins, H. pylori CagA function appears to be linked to
actin dynamics. Similar to the F-BAR proteins, CagA
contains a subdomain (a four-helical bundle located at the end
of helix α19 and comprising helices α19-α22 (Figure 2))
that shares structural similarities with proteins that
interact with actin-regulatory networks, such as the F-actin
binding domain of the Bcr-Abl tyrosine kinase, α-catenin
and vinculin [33]. The uncovered similarities between the
bacterial effector protein CagA and eukaryotic F-BAR
proteins that are implicated in endocytosis suggest
convergent evolution of CagA towards a similar function and
raise the question of whether secreted CagA can facilitate
its own uptake into human epithelial cells via an
endocytosis-like process.

Additional file 



Additional file 1: Alignment of CagA amino acid sequences from 44 
different H. pylori strains over the region 613-641 (strain 26695 
numbering). 